Parallelization of the TD(λ) Learning Algorithm
Abstract
Running the TD(λ) algorithm on complex simulated tasks may take several days of computation time. To reduce this time, it is possible to take advantage of parallel computation architectures. In this paper, we present a method to parallelize the TD(λ) algorithm: episodes are run in parallel, and the weight changes obtained by each processor are summed at synchronization points. We present experimental results on motor-control tasks of varying complexity, together with a theoretical bound on the parallelization error.
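As a rough illustration of this scheme, the sketch below has each worker run one episode of linear TD(λ) from a frozen snapshot of the weights and return its accumulated weight change; the changes from all processors are then summed at each synchronization point. The toy environment, the constants, and all names (run_episode, N_FEATURES, etc.) are illustrative assumptions, not code from the paper.

```python
import numpy as np
from concurrent.futures import ProcessPoolExecutor

N_FEATURES = 8                   # illustrative feature-vector size
GAMMA, LAM, ALPHA = 0.99, 0.9, 0.01
N_WORKERS, EPISODE_LEN = 4, 100

def run_episode(args):
    """Run one episode of linear TD(lambda) on a toy task, starting from
    a frozen weight snapshot, and return the accumulated weight change."""
    w0, seed = args
    rng = np.random.default_rng(seed)
    w = w0.copy()                        # local copy, updated online
    z = np.zeros(N_FEATURES)             # eligibility trace
    x = rng.random(N_FEATURES)           # toy start-state features
    for _ in range(EPISODE_LEN):
        x_next = rng.random(N_FEATURES)  # toy transition
        r = rng.random()                 # toy reward
        delta = r + GAMMA * (w @ x_next) - w @ x  # TD error
        z = GAMMA * LAM * z + x                   # accumulating trace
        w = w + ALPHA * delta * z                 # TD(lambda) update
        x = x_next
    return w - w0                        # this processor's weight change

if __name__ == "__main__":
    w = np.zeros(N_FEATURES)
    with ProcessPoolExecutor(max_workers=N_WORKERS) as pool:
        for sync in range(10):           # synchronization points
            args = [(w, 1000 * sync + i) for i in range(N_WORKERS)]
            # sum the weight changes from all processors at the sync point
            w = w + sum(pool.map(run_episode, args))
    print(w)
```

Because every worker starts each round from the same snapshot, the summed update differs from what a single sequential run would produce; the paper's theoretical bound concerns exactly this parallelization error.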
Similar resources
An Empirical Evaluation of True Online TD(λ)
The true online TD(λ) algorithm has recently been proposed (van Seijen and Sutton, 2014) as a universal replacement for the popular TD(λ) algorithm in temporal-difference learning and reinforcement learning. True online TD(λ) has better theoretical properties than conventional TD(λ), and the expectation is that it also results in faster learning. In this paper, we put this hypothesis to the test...
TD(λ) Networks: Temporal-Difference Networks with Eligibility Traces
Temporal-difference (TD) networks have been introduced as a formalism for expressing and learning grounded world knowledge in a predictive form (Sutton & Tanner, 2005). Like conventional TD(0) methods, the learning algorithm for TD networks uses 1-step backups to train prediction units about future events. In conventional TD learning, the TD(λ) algorithm is often used to do more general multi-step...
Efficient Reinforcement Learning Using Recursive Least-Squares Methods
The recursive least-squares (RLS) algorithm is one of the most well-known algorithms used in adaptive filtering, system identification and adaptive control. Its popularity is mainly due to its fast convergence speed, which is considered to be optimal in practice. In this paper, RLS methods are used to solve reinforcement learning problems, where two new reinforcement learning algorithms using l...
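For context, the RLS recursion this entry builds on can be sketched in its generic adaptive-filtering form; the function name and forgetting factor below are illustrative, and this is not the paper's reinforcement-learning variant.

```python
import numpy as np

def rls_update(w, P, x, d, lam=0.99):
    """One step of standard recursive least-squares.
    w: weight vector, P: inverse correlation matrix, x: input vector,
    d: desired output, lam: forgetting factor (illustrative default)."""
    Px = P @ x
    k = Px / (lam + x @ Px)             # gain vector
    e = d - w @ x                       # a priori prediction error
    w = w + k * e                       # weight update
    P = (P - np.outer(k, Px)) / lam     # inverse-correlation update
    return w, P
```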
Kernel Least-Squares Temporal Difference Learning
Kernel methods have attracted much research interest recently because, by utilizing Mercer kernels, non-linear and non-parametric versions of conventional supervised or unsupervised learning algorithms can be implemented, and better generalization ability is usually obtained. However, kernel methods have not been widely studied in the reinforcement learning literature. In this paper, we...
True Online TD(λ)
TD(λ) is a core algorithm of modern reinforcement learning. Its appeal comes from its equivalence to a clear and conceptually simple forward view, and the fact that it can be implemented online in an inexpensive manner. However, the equivalence between TD(λ) and the forward view is exact only for the off-line version of the algorithm (in which updates are made only at the end of each episode). ...
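The online update that restores this exact equivalence, following van Seijen and Sutton's true online TD(λ) for the linear case, can be sketched roughly as follows; the env_step callback and function name are assumptions made for illustration.

```python
import numpy as np

def true_online_td_lambda_episode(w, env_step, x0,
                                  alpha=0.01, gamma=0.99, lam=0.9):
    """One episode of true online TD(lambda) with linear function
    approximation (dutch-style traces).
    env_step is an assumed callback: features -> (reward, next_features, done)."""
    x = x0
    z = np.zeros_like(w)     # dutch-style eligibility trace
    v_old = 0.0
    done = False
    while not done:
        r, x_next, done = env_step(x)
        v = w @ x
        v_next = 0.0 if done else w @ x_next     # terminal value is zero
        delta = r + gamma * v_next - v           # TD error
        z = gamma * lam * z + (1.0 - alpha * gamma * lam * (z @ x)) * x
        w = w + alpha * (delta + v - v_old) * z - alpha * (v - v_old) * x
        v_old = v_next
        x = x_next
    return w
```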
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
Publication year: 2005